Kim, Su Nam and Timothy Baldwin (2008) An Unsupervised Approach to Interpreting Noun Compounds, In Proceedings of 2008 IEEE International Conference on Natural Language Processing and Knowledge Engineering (IEEE NLP-KE'08), Beijing, China

Authors

  • Su Nam Kim
  • Timothy Baldwin
Abstract

This paper proposes an unsupervised approach to automatically interpret noun compounds using semantic similarity. Our proposed unsupervised method is based on obtaining a large amount of robust evidence for NC interpretation. In order to obtain evidence sentences for semantic relations (SRs), we first acquired sentences containing both a head noun and its modifier in the form of SR definitions. Then we determined the semantic relations represented in the sentences by looking at the nouns in the test instances (noun mapping) and verbs in the SR definitions (verb mapping). In the noun mapping, we measured the similarity between nouns in test instances and nouns in the collected sentences. In the verb mapping, we mapped the verbs of sentences onto those in the SR definitions. Finally, we built a statistical classifier to interpret noun compounds and evaluated it over 17 SRs defined in [1].
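
The noun-mapping and verb-mapping steps described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the similarity table, the verb-to-relation table, the relation labels, the evidence triples, the threshold, and the function names are hypothetical and not taken from the paper. The paper also builds a statistical classifier, whereas this sketch swaps in simple similarity-weighted voting.

```python
from collections import Counter

# Hypothetical toy similarity scores between nouns (NOT from the paper).
# A real system would derive these from a lexical resource or corpus.
NOUN_SIM = {
    frozenset(("apple", "fruit")): 0.8,
    frozenset(("apple", "orange")): 0.7,
    frozenset(("juice", "drink")): 0.9,
}

# Hypothetical mapping from verbs found in evidence sentences to
# semantic relation (SR) labels; the labels are illustrative only.
VERB_TO_SR = {"make": "MATERIAL", "contain": "CONTAINER", "cause": "CAUSE"}

# Evidence sentences reduced to (modifier, verb, head) triples (toy data).
EVIDENCE = [
    ("fruit", "make", "drink"),
    ("orange", "make", "drink"),
    ("box", "contain", "drink"),
]


def noun_similarity(a, b):
    """Return a toy similarity in [0, 1]; unknown pairs score 0."""
    if a == b:
        return 1.0
    return NOUN_SIM.get(frozenset((a, b)), 0.0)


def interpret(modifier, head, evidence=EVIDENCE, threshold=0.5):
    """Pick the SR with the highest similarity-weighted vote, or None."""
    votes = Counter()
    for ev_mod, verb, ev_head in evidence:
        sr = VERB_TO_SR.get(verb)  # verb mapping
        if sr is None:
            continue
        # Noun mapping: both modifier and head must resemble the evidence nouns.
        score = noun_similarity(modifier, ev_mod) * noun_similarity(head, ev_head)
        if score >= threshold:
            votes[sr] += score
    return votes.most_common(1)[0][0] if votes else None


if __name__ == "__main__":
    print(interpret("apple", "juice"))
    print(interpret("stone", "wall"))
```

With this toy data, "apple juice" matches the two "make" triples and receives the hypothetical MATERIAL label, while "stone wall" finds no sufficiently similar evidence and returns None.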

Similar resources

Kim, Su Nam and Timothy Baldwin (2008) Benchmarking Noun Compound Interpretation, In Proceedings of the Third International Joint Conference on Natural Language Processing (IJCNLP 2008), Hyderabad, India

In this paper we provide benchmark results for two classes of methods used in interpreting noun compounds (NCs): semantic similarity-based methods and their hybrids. We evaluate the methods using 7-way and binary class data from the nominal pair interpretation task of SEMEVAL-2007. We summarize and analyse our results, with the intention of providing a framework for benchmarking future researc...

Kim, Su Nam and Timothy Baldwin (to appear) Word Sense Disambiguation and Noun Compounds, ACM Transactions on Speech and Language Processing

In this paper, we investigate word sense distributions in noun compounds (NCs). Our primary goal is to disambiguate the word sense of component words in NCs, based on investigation of “semantic collocation” between them. We use sense collocation and lexical substitution to build supervised and unsupervised word sense disambiguation (WSD) classifiers, and show our unsupervised learner to be supe...

Kim, Su Nam and Timothy Baldwin (2013) A Lexical Semantic Approach to Interpreting and Bracketing English Noun Compounds, Natural Language Engineering 19(3), pp. 385-407

This paper presents a study on the interpretation and bracketing of noun compounds (“NCs”), based on lexical semantics. Our primary goal is to develop a method to automatically interpret NCs through the use of semantic relations. Our NC interpretation method is based on lexical similarity with tagged NCs, based on lexical similarity measures derived from WordNet. We apply the interpretation meth...

Baldwin, Timothy, Su Nam Kim, Francis Bond, Sanae Fujita, David Martinez and Takaaki Tanaka (2008) MRD-based Word Sense Disambiguation: Further Extending Lesk, In Proceedings of the Third International Joint Conference on Natural Language Processing (IJCNLP 2008), Hyderabad, India

This paper reconsiders the task of MRD-based word sense disambiguation, in extending the basic Lesk algorithm to investigate the impact on WSD performance of different tokenisation schemes, scoring mechanisms, methods of gloss extension and filtering methods. In experimentation over the Lexeed Sensebank and the Japanese Senseval2 dictionary task, we demonstrate that character bigrams with sense-s...

An Unsupervised Approach to Domain-Specific Term Extraction

Domain-specific terms provide vital semantic information for many natural language processing (NLP) tasks and applications, but remain a largely untapped resource in the field. In this paper, we propose an unsupervised method to extract domain-specific terms from the Reuters document collection using term frequency and inverse document frequency.

Publication date: 2008